Abstract

It is shown that bandwidth estimation in packet networks can be viewed in terms of min-plus linear system theory. The available bandwidth of a link or complete path is expressed in terms of a service curve, which is a function that appears in the network calculus to express the service available to a traffic flow. The service curve is estimated based on measurements of a sequence of probing packets or passive measurements of a sample path of arrivals. It is shown that existing bandwidth estimation methods can be derived in the min-plus algebra of the network calculus, thus providing further mathematical justification for these methods. Principal difficulties of estimating available bandwidth from measurements of network probes are related to potential non-linearities of the underlying network. When networks are viewed as systems that operate either in a linear or in a non-linear regime, it is argued that probing schemes extract the most information at a point when the network crosses from a linear to a non-linear regime. Experiments on the Emulab testbed at the University of Utah evaluate the robustness of the system theoretic interpretation of networks in practice. Multi-node experiments evaluate how well the convolution operation of the min-plus algebra provides estimates for the available bandwidth of a path from estimates of individual links.
I. Introduction

The benefits of knowing how much network bandwidth is available to an application have motivated the development of techniques that infer bandwidth availability from traffic measurements, e.g., [39], [40], [45]. With a large number of methods available and much empirical experience gained, increasing effort has recently been put towards improving the theoretical understanding of measurement-based estimation of available bandwidth, e.g., [6], [24], [33], [46].



This paper presents a new foundational approach to reason about available bandwidth estimation as the analysis of a min-plus linear system. Min-plus linear system theory has provided the mathematical underpinning for the deterministic network calculus [9], [29]. We will use min-plus system theory to explain how bandwidth estimation methods infer information about a network and find bandwidth estimation methods that can extract the most information from a network. Some key difficulties encountered when measuring available bandwidth become evident in a system theoretic view.

We view bandwidth estimation as the problem of determining unknown functions that describe the available bandwidth based on measurements of a sequence of probing packets or passive measurements of a sample path of arrivals. These functions correspond to the service curves that appear in the network calculus [11], where they are used to express the available service at a network link or an end-to-end path. Working within the context of the network calculus, we can apply a result that allows us to compute the service curve of a network path from service curves of the links of the path. This is done by applying the convolution operator of the min-plus algebra [9], [29]. We explore how well the convolution of the available bandwidth of multiple links, expressed as service curves, can describe the available bandwidth of an end-to-end path.

Our formulation of available bandwidth estimation in min-plus linear system theory reveals that the underlying problem is intrinsically hard, requiring the solution to a maximin optimization problem. The optimization problem becomes more tractable when the network satisfies the property of 'min-plus linearity'. We show that some existing estimation techniques can be accurately characterized if we interpret them as analyzing a network with linear input-output relationships. The discovery of an implicit assumption of min-plus linearity in existing measurement methods is seemingly at odds with empirical evidence that these methods have been successfully applied in networks that do not satisfy linearity. For example, even a single FIFO link violates the requirements of min-plus linearity. We resolve this apparent contradiction by showing that some networks can be decomposed into disjoint min-plus linear and non-linear regions. These networks behave as a min-plus linear system at low load, and become non-linear if the load exceeds a certain threshold. The crossing of the linear and non-linear regions marks the point where the available bandwidth can be observed.

The arguments in this paper draw from known relationships between linear system theory and the network calculus. 
The success in describing relatively complex probing schemes using min-plus algebra and the ability to concatenate
the available bandwidths of multiple links using the min-plus convolution hint at a possibly stronger link between
bandwidth estimation and network calculus.

The assumptions in this paper on network and traffic characteristics are analogous to those in most papers on bandwidth estimation techniques (see Section II). The available bandwidth is represented by a random process, where the source of randomness is the variability of network traffic. A major assumption is that the time scale of network measurements is small compared to the time scale at which characteristics of network traffic or network links change. This assumption is not justified when properties of a network link vary on short time scales, e.g., on wireless transmission channels with random noise. Consequently, such networks are not adequately described in our min-plus system theoretic formulation.

The objective of this paper is to offer an alternative interpretation for bandwidth estimation that potentially enables the development of improved bandwidth estimation schemes. We previously mentioned that the convolution operator in the min-plus algebra can be exploited to compute bandwidth estimates for end-to-end paths. Additionally, by generalizing the available bandwidth in terms of service curves we can express multiple data rates at different time scales. This makes it possible to distinguish a short-term reduction of the data rate due to temporary link congestion from the long-term utilization of a link or a path. While we discuss and evaluate implementations of bandwidth probing schemes in measurement experiments on a testbed network, we emphasize that our objective is a validation of the system theoretic interpretation of these methods, and not an empirical comparison of existing probing schemes.

The remainder of this paper is structured as follows. In Section II we discuss bandwidth estimation methods and other related work. In Section III we review the min-plus linear system interpretation of the deterministic network calculus. In Section IV we formulate bandwidth estimation as the solution to an inversion problem in min-plus algebra. In Section V we derive solutions to compute the inversion, and relate them to probing schemes from the literature. In Section VI we justify how these probing schemes can be applied in networks that are not min-plus linear. In Section VII we present measurement experiments of probing schemes suggested by the min-plus system theoretic concepts from this paper. We present brief conclusions in Section VIII.



II. Available Bandwidth Estimation Techniques 

The goal of bandwidth estimation is to infer from measurements a reliable estimate of the unused capacity at a multi-access link, a single switch, or a network path. The available bandwidth of a network link $i$ in a time interval $[t, t+\tau)$ can be specified as [45]

\[ \alpha_i(t, t+\tau) = \frac{1}{\tau} \int_t^{t+\tau} \bigl( C_i(x) - \lambda_i(x) \bigr) \, dx , \]

where $C_i(t)$ and $\lambda_i(t)$ are the capacity and total traffic, respectively, on link $i$ at time $t$. We note that individual definitions of available bandwidth used in the literature may deviate from the above definition. It is generally assumed that link capacities have a constant rate, i.e., $C_i(x) = C_i$. Then, the available bandwidth can be interpreted as a random process, where the randomness stems from the variability of network traffic.

If available bandwidth estimates for single links are available, the available bandwidth of an end-to-end network path with $H$ links is computed as [21]

\[ \alpha(t, t+\tau) = \min_{i=1,\dots,H} \alpha_i(t, t+\tau) . \tag{1} \]

The link at which the minimum is attained is often referred to as the tight link. Available bandwidth methods measure the transmission of a sequence of control (probe) packets and use the measurements to estimate or bound the available bandwidth. Closely related are probing schemes that seek to determine the minimum capacity along a path, referred to as bottleneck capacity or capacity of the narrow link. If the time scale of measurements is small compared to the time scale at which characteristics of network traffic change, network traffic can be described by a deterministic function or even a constant rate function. In this case, a single sample of the available bandwidth can be interpreted as being conditioned on the state of the network. Evaluating a large number of samples corresponds to computing a conditional average. Under a broad set of assumptions, such as stationarity of the distribution of traffic, the conditional averages are computed correctly. When network characteristics change on a short time scale, e.g., on wireless channels with random noise, a description of traffic and link by deterministic functions is not suitable.
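The per-link definition and Eq. (1) can be illustrated with a small numerical sketch. It assumes a constant link capacity and equally spaced samples of the traffic rate, so that the integral becomes a plain average; the function names and the sample values are hypothetical and only serve the illustration.

```python
from typing import List, Sequence, Tuple


def link_available_bandwidth(capacity: float, traffic_rate: Sequence[float]) -> float:
    """Average unused capacity of one link over an interval.

    Assumes a constant capacity and equally spaced samples of the
    traffic rate, which turns the integral into an arithmetic mean.
    """
    if not traffic_rate:
        raise ValueError("need at least one traffic sample")
    return sum(capacity - x for x in traffic_rate) / len(traffic_rate)


def path_available_bandwidth(link_estimates: List[float]) -> Tuple[float, int]:
    """Eq. (1): minimum over all links, plus the index of the tight link."""
    tight = min(range(len(link_estimates)), key=lambda i: link_estimates[i])
    return link_estimates[tight], tight


if __name__ == "__main__":
    # Three hypothetical links (rates in Mbps).
    estimates = [
        link_available_bandwidth(100.0, [20.0, 40.0, 30.0, 30.0]),
        link_available_bandwidth(50.0, [10.0, 20.0]),
        link_available_bandwidth(100.0, [10.0, 10.0]),
    ]
    print(path_available_bandwidth(estimates))
```

The second return value identifies the tight link, which need not be the link with the smallest capacity.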

Almost all proposed probing schemes perform measurements of packet pairs or packet trains. Packet pairs consist of two packets with a defined spacing, and packet trains consist of more than two packets. Since it was first suggested in [18], [26], packet pair probing has evolved significantly, and has been used for estimating the bottleneck capacity (e.g., Bprobe [7], CapProbe [25]), the available bandwidth (e.g., ABwE [38], Spruce [45]), and the distribution of cross traffic [34]. The rationale behind these methods builds on the relation of packet dispersion and available bandwidth resources, i.e., packet pairs with a defined gap may be spaced out on slow or loaded links and thus carry information about the network path. Some techniques, e.g., [34], [45], build on a model of a single link whose capacity is assumed to be known.

The majority of proposed methods employ packet trains for bottleneck capacity estimation (e.g., PBM [39], Cprobe [7], pathrate [14]), and for available bandwidth estimation (e.g., pathload [20], pathvar [22], TOPP [35], PTR/IGI [17], pathchirp [40], and BFind [4]). The general approach is to adaptively vary the rate of probing traffic to induce congestion in the network. A comprehensive discussion of all techniques is beyond the scope of this paper. For details and empirical evaluations of packet train and packet pair methods we refer to a series of available articles [21], [42], [43], [44], [45]. Some studies have found that packet trains provide more reliable bandwidth estimates than packet pairs [21], [32]. The wide spectrum of bandwidth estimation methods indicates the complexity of measuring available bandwidth in a network. In particular, the comparative evaluations of bandwidth estimation methods sometimes widely disagree in their conclusions on the capabilities and limitations of individual methods.

For the purposes of this paper, the two packet train methods pathload and pathchirp are particularly relevant. Pathload 
uses a sequence of constant rate packet trains, where the transmission rate of consecutive trains is iteratively varied until 
it converges to the available bandwidth. In pathchirp, the rate is varied within a single packet train using geometrically 
decreasing inter-packet gaps. Both methods interpret increasing delays as an indication of overload, i.e., as a sign that the
probing rate exceeds the available bandwidth.
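The rate sweep inside a single chirp can be sketched as follows. This is only an illustration of geometrically decreasing gaps, not the pathchirp implementation: the function name, the parameter names (`first_gap`, `spread`) and the numbers are assumptions made for the example.

```python
from typing import List, Tuple


def chirp_rates(packet_bits: float, first_gap: float, spread: float,
                n_packets: int) -> Tuple[List[float], List[float]]:
    """Inter-packet gaps and instantaneous probing rates of one chirp.

    Gap k is first_gap / spread**k, so consecutive gaps shrink by the
    (hypothetical) factor `spread` and the rate packet_bits / gap grows
    geometrically from one packet to the next.
    """
    if spread <= 1.0:
        raise ValueError("spread must exceed 1 for decreasing gaps")
    if n_packets < 2:
        raise ValueError("a chirp needs at least two packets")
    gaps = [first_gap / spread ** k for k in range(n_packets - 1)]
    rates = [packet_bits / gap for gap in gaps]
    return gaps, rates


if __name__ == "__main__":
    # 1500-byte packets, first gap 12 ms, gaps halve from packet to packet.
    gaps, rates = chirp_rates(12000, 0.012, 2.0, 6)
    for gap, rate in zip(gaps, rates):
        print(f"gap {gap * 1e3:7.3f} ms -> rate {rate / 1e6:6.2f} Mbps")
```

A single train therefore probes a whole range of rates, whereas a constant rate train as in pathload probes one rate per train.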

Most estimation techniques are designed with an assumption that the network as a whole exhibits the behavior of a single link with constant rate fluid cross traffic. Often it is assumed that the network behaves as a single FIFO system [17], [24], [31], [32], [33], [34], [35], [36], [40], [45]. This is justified by the particular packet dispersion of FIFO systems, which is matched by empirical data [36]. It has been found that the best estimates are obtained if the probing traffic increases the load close to, but not beyond, an overloaded state.

Some probing methods suggest that probing traffic should follow a Poisson process, e.g., [28], [32], [39], [45], [50], since it can then benefit from the PASTA (Poisson Arrivals See Time Averages) property. Briefly, the PASTA property states that, under a broad set of assumptions, a Poisson arrival process observes the average state of the system. An empirical study [??] found that Poisson probing does not necessarily lead to improved estimates of the available bandwidth. It has also been pointed out that, in the case of non-intrusive probing, Poisson probing is not justified by default and may even be inferior to other schemes, since it neither minimizes estimation variance nor provably reduces inversion bias, e.g., when deriving quantities of interest such as available bandwidths from observations.

A set of analytical studies [31], [32], [33] characterizes the dispersion of probing traffic over single hop and multi hop paths in terms of probing-response curves, and extracts the available bandwidth from these curves. Under the assumption of fluid constant rate cross-traffic, probing-response curves feature a sharp bend at the available bandwidth that is used as a criterion by some methods, e.g., TOPP [35]. The mode of operation of many other methods, e.g., the detection of overload by pathload, can be related to these curves [32]. Under general bursty cross-traffic the unique turning point of probing-response curves diminishes, whereas it can be recovered under idealized conditions, e.g., using packet trains of infinite length, as shown in [31], [32], [33].



An alternative approach to sending probe packets is to obtain estimates of the available bandwidth through passive measurements of user traffic. This is the preferred approach in measurement based admission control (MBAC), which seeks to determine if a network has sufficient resources to support minimal service requirements for a traffic flow or aggregate [8], [23]. In comparison to passive measurements, probing schemes have an additional degree of freedom since they can control the traffic profile of probing packets.

We note that links between network calculus and bandwidth estimation have been made before, mostly in the context of MBAC [8], [23], [47], [49]. Since MBAC studies are set in a context of providing service guarantees, they generally seek to obtain a worst-case description of the available service or traffic, in terms of time-invariant envelope functions. Worst-case characterizations, even if relaxed to stochastic bounds, tend to be highly conservative. In this paper, we do not use envelopes to describe traffic or service. For traffic that is transmitted at a lower priority as in [23], [47], the network calculus permits a concise description of the available bandwidth as the leftover capacity which is unused by higher priority traffic. Aspects of a min-plus system theoretic interpretation of available bandwidth can be found in [2], which exploits a known relationship between the Legendre transform of the backlog and the available bandwidth.

Fig. 1. Linear time-invariant system and min-plus linear network. (Diagram: an input signal $A(t)$ (arrivals) enters a system with impulse response (a network with service curve) and leaves as an output signal $D(t)$ (departures).)

III. Min-Plus Linear System Theory for Networks 

This section reviews the linear system representation of networks and introduces needed concepts and notation. We 
consider a continuous-time setting. 

Classical linear system theory deals with linear time-invariant (LTI) systems with input signal $A(t)$ and output signal $D(t)$ (see Fig. 1). Linear means that for any two pairs of input and output signals $(A_1, D_1)$ and $(A_2, D_2)$, any linear combination of input signals $b_1 A_1(t) + b_2 A_2(t)$ results in the linear combination of output signals $b_1 D_1(t) + b_2 D_2(t)$. Time-invariant means that for any pair of inputs and outputs $(A, D)$, a time-shifted input $A(t - \tau)$ results in a shifted output $D(t - \tau)$.

Let $S(t)$ be the impulse response of the system, that is, the output signal generated by the system if the input signal is a unity (Dirac) impulse at time zero. The basic property of an LTI system is that it is completely characterized by its impulse response, where the output of the system is expressed as the convolution of the input signal and the impulse response:

\[ D(t) = \int_{-\infty}^{\infty} A(\tau) S(t - \tau) \, d\tau =: A * S(t) . \]

A. Min-Plus Algebra in the Network Calculus 

A significant discovery of networking research from the 1990s is that networks can often be viewed as linear systems, when the usual algebra is replaced by a so-called min-plus algebra [3], [9], [29]. In a min-plus algebra [5], addition is replaced by a minimum (we write infimum) and multiplication is replaced by an addition. Similar to LTI systems, a min-plus linear system is a system that is linear under the min-plus algebra. This means that a min-plus linear combination of input functions $\inf\{b_1 + A_1(t), b_2 + A_2(t)\}$ results in the corresponding linear combination of output signals $\inf\{b_1 + D_1(t), b_2 + D_2(t)\}$. In min-plus system theory, the burst function

\[ \delta(t) = \begin{cases} \infty , & \text{if } t > 0 , \\ 0 , & \text{otherwise} , \end{cases} \tag{2} \]

takes the place of the Dirac impulse function.

Let $S(t)$ be the impulse response, that is, the output when the input is the burst function $\delta(t)$. Any time-invariant min-plus linear system is completely described by its impulse response, and the output of any min-plus linear system can be expressed as a linear combination of the input and shifted impulse responses by

\[ D(t) = \inf_{\tau} \{ A(\tau) + S(t - \tau) \} =: A * S(t) . \]

In analogy to LTI systems, this operation is referred to as convolution of the min-plus algebra.¹ If there exists a function $S(t)$ such that $D(t) = A * S(t)$ for all pairs $(A, D)$, then it follows that the system is min-plus linear.
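A discrete-time sketch of the min-plus convolution may help to fix ideas. It assumes functions sampled on the integer grid $t = 0, 1, \dots, n-1$ and restricts the infimum to $0 \le \tau \le t$; the helper names are hypothetical. The last lines check that the sampled burst function acts as the neutral element, so that a burst input reproduces the service curve at the output.

```python
import math
from typing import List, Sequence


def minplus_conv(f: Sequence[float], g: Sequence[float]) -> List[float]:
    """(f * g)[t] = min over 0 <= s <= t of f[s] + g[t - s]."""
    n = min(len(f), len(g))
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(n)]


def burst(n: int) -> List[float]:
    """Sampled burst function: 0 at t = 0 and infinity for t > 0."""
    return [0.0] + [math.inf] * (n - 1)


if __name__ == "__main__":
    # Rate-latency service curve with rate 3 and delay 1 (arbitrary units).
    service = [0, 0, 3, 6, 9, 12]
    # Arrivals: 4 units per slot for two slots, then silence.
    arrivals = [0, 4, 8, 8, 8, 8]
    print(minplus_conv(arrivals, service))        # departures
    print(minplus_conv(burst(6), service))        # equals the service curve
```

The direct evaluation costs $O(n^2)$ operations, which is adequate for short probes.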

The min-plus convolution shares many properties with the usual convolution, e.g., it is commutative and associative. The associativity of min-plus convolution is of particular importance since it implies an easy way of concatenating systems in series. Given a tandem of two min-plus linear systems $S_1(t)$ and $S_2(t)$, the output can be computed iteratively as $D(t) = (A * S_1) * S_2(t)$ and, with associativity, $D(t) = A * (S_1 * S_2)(t)$ holds. Generalizing, a tandem of $N$ systems that are characterized by impulse responses $S_1, S_2, \dots, S_N$ is equivalent to a single system with impulse response

\[ S(t) = S_1 * S_2 * \dots * S_N(t) . \tag{3} \]
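Eq. (3) can be checked numerically for rate-latency curves. The sketch below assumes integer time slots and integer delays, so that the sampled convolution is exact; function names and parameter values are hypothetical. For rate-latency curves, the concatenation is again a rate-latency curve whose rate is the minimum of the rates and whose delay is the sum of the delays.

```python
from typing import List, Sequence


def minplus_conv(f: Sequence[float], g: Sequence[float]) -> List[float]:
    """(f * g)[t] = min over 0 <= s <= t of f[s] + g[t - s]."""
    n = min(len(f), len(g))
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(n)]


def rate_latency(rate: int, delay: int, n: int) -> List[int]:
    """Sampled curve S(t) = rate * max(t - delay, 0) for t = 0..n-1."""
    return [rate * max(t - delay, 0) for t in range(n)]


def path_service_curve(curves: Sequence[Sequence[float]]) -> List[float]:
    """Eq. (3): convolve the service curves of all links of a path."""
    result = list(curves[0])
    for curve in curves[1:]:
        result = minplus_conv(result, curve)
    return result


if __name__ == "__main__":
    n = 20
    links = [rate_latency(3, 2, n), rate_latency(5, 4, n), rate_latency(4, 1, n)]
    path = path_service_curve(links)
    print(path == rate_latency(3, 7, n))
```

By associativity and commutativity, the order in which the links are convolved does not affect the result.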

The observation that some networks can be adequately modeled by a min-plus linear system led to the min-plus formulation of the network calculus [3], [9], [29]. Here, a system is a network element or entire network, input and output functions $A$ and $D$ are arrivals and departures, respectively, and the impulse response $S$, called the service curve, represents the service guarantee by a network element. Network elements that are known to be min-plus linear include work-conserving constant rate links ($S(t) = Ct$, where $C$ is the link capacity), traffic shapers ($S(t) = \sigma + \rho t$, where $\sigma$ is a burst size and $\rho$ is a rate), and rate-latency servers ($S(t) = r (t - d)_+$, where $r$ is a rate, $d$ is a delay, and $(x)_+ = \max(x, 0)$), and their concatenations. As in [3], [9], [29] we make the convention that functions in the min-plus linear system theory are non-decreasing non-negative functions that pass through the origin.

The relevance of the network calculus as a tool for the analysis of networks results from an extension of its formal framework to networks that do not satisfy the conditions of min-plus linearity. Non-linear systems implement more complex mappings $\Pi$ of arrival to departure functions, $D(t) = \Pi(A)(t)$. In the network calculus, these are replaced by linear mappings that provide bounds of the form $D(t) \ge A * \underline{S}(t)$ or $D(t) \le A * \overline{S}(t)$ ([29], pp. xviii). Here, $\underline{S}$ is referred to as a lower service curve and $\overline{S}$ is referred to as an upper service curve, indicating that they are bounds on the available service. In a min-plus linear system, the service curve $S$ is both an upper and a lower service curve ($S = \underline{S} = \overline{S}$), which is therefore frequently referred to as exact service curve.

B. Legendre transform in Min-Plus Linear Systems 

In classical linear system theory, the Fourier transform of $f(t)$, denoted by $\mathcal{F}_f(\omega)$, establishes a dual domain, the frequency domain, for analysis of LTI systems. In the frequency domain, the Fourier transform turns the convolution into a multiplication, that is, $\mathcal{F}_{f * g}(\omega) = \mathcal{F}_f(\omega) \cdot \mathcal{F}_g(\omega)$.

In min-plus linear systems, the Legendre transform, also referred to as convex Fenchel conjugate, plays a similar role. The Legendre transform of a function $f(t)$ is defined as

\[ \mathcal{L}_f(r) = \sup_{\tau} \{ r \tau - f(\tau) \} . \]

Since $r$ can be interpreted as a rate, one may view the domain established by the Legendre transform as a rate domain. The Legendre transform takes the min-plus convolution to an addition [41], that is,²

\[ \mathcal{L}_{f * g} = \mathcal{L}_f + \mathcal{L}_g . \tag{4} \]

A further property of the Legendre transform that we exploit in this paper is that, for convex functions $f$, we have

\[ \mathcal{L}(\mathcal{L}_f) = f . \tag{5} \]

In other words, a convex function $f$ can be recovered from $\mathcal{L}_f$ by reapplying the Legendre transform [41]. In general, we only have

\[ \mathcal{L}(\mathcal{L}_f) \le f \quad \text{and} \quad \mathcal{L}(\mathcal{L}_f) = \mathrm{conv}_f , \tag{6} \]

where $\mathrm{conv}_f$ denotes the convex hull of $f$, defined as the largest convex function smaller than $f$.

¹ We re-use the symbol of the operator for notational simplicity. The context makes this slight abuse of notation non-ambiguous.
² Whenever possible, from now on we use the shorthand notation $f$ to mean '$f(t)$ for all $t \ge 0$', and $\mathcal{L}_f$ to mean '$\mathcal{L}_f(r)$ for all $r \ge 0$'.

Fig. 2. Example arrival and departure function of a probe of five packets.

Another property that will be used is that the Legendre transform reverses the order of an inequality, i.e.,

\[ f \ge g \implies \mathcal{L}_f \le \mathcal{L}_g . \tag{7} \]

The statement is an equivalency when $g$ is convex. Applications of the Legendre transform in the network calculus have been previously studied in [2], [15], [16], [37].
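The additive property of Eq. (4) can be illustrated on sampled rate-latency curves. The sketch assumes an integer time grid with a finite horizon, so the supremum is taken over the sampled points only; this truncation is harmless for rates that do not exceed the rates of the curves, which is the only case checked below. All names and numbers are hypothetical.

```python
from typing import List, Sequence


def minplus_conv(f: Sequence[float], g: Sequence[float]) -> List[float]:
    """(f * g)[t] = min over 0 <= s <= t of f[s] + g[t - s]."""
    n = min(len(f), len(g))
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(n)]


def rate_latency(rate: int, delay: int, n: int) -> List[int]:
    """Sampled curve S(t) = rate * max(t - delay, 0) for t = 0..n-1."""
    return [rate * max(t - delay, 0) for t in range(n)]


def legendre(f: Sequence[float], r: float) -> float:
    """Legendre transform at rate r, with the sup taken over the grid."""
    return max(r * t - f[t] for t in range(len(f)))


if __name__ == "__main__":
    n = 30
    f = rate_latency(3, 2, n)
    g = rate_latency(5, 4, n)
    h = minplus_conv(f, g)
    for r in range(0, 4):  # rates up to min(3, 5)
        print(r, legendre(h, r), legendre(f, r) + legendre(g, r))
```

For a rate-latency curve with rate $R$ and delay $d$, the transform evaluates to $r d$ for $0 \le r \le R$, so the delays of concatenated systems add up in the rate domain.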



IV. A Min-Plus Algebra Formulation of the Bandwidth Estimation Problem 

We view a network as a min-plus linear or non-linear system that converts input signals (arrivals) into output signals (departures) according to a fixed but unknown service curve $S$. The service curve of the network expresses the available bandwidth, which can be a constant-rate or a more complex function. Measurements of a network probe, defined as a sequence of at least two packets, can be characterized by an arrival function $A^p(t)$ and a departure function $D^p(t)$, where the functions represent the cumulative number of bits seen in the interval $[0, t]$ and time 0 denotes the beginning of the probe. We assume that the system satisfies time-invariance over the duration of a probe. This corresponds to an assumption stated in Section II that network characteristics do not change over the duration of a measurement. The arrival and departure functions of a probe are constructed from timestamps of the transmission and reception of packets, and from knowledge of the packet size. In Fig. 2 we illustrate a network probe consisting of five packets of equal size with fixed spacing between consecutive packets. The vertical distance between arrivals and departures is defined as the virtual backlog $B^p(t) = A^p(t) - D^p(t)$. The horizontal distance is defined as the virtual delay $W^p(t)$.
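The construction of the probe functions from timestamps can be sketched as follows. The example assumes equal packet sizes, synchronized clocks, no losses and in-order delivery, so that the horizontal distance can be read off per packet; the function names and timestamp values are hypothetical.

```python
from typing import List, Sequence


def cumulative_bits(timestamps: Sequence[float], packet_bits: int, t: float) -> int:
    """Bits of the probe seen in [0, t], given per-packet timestamps."""
    return packet_bits * sum(1 for ts in timestamps if ts <= t)


def virtual_backlog(send_times: Sequence[float], recv_times: Sequence[float],
                    packet_bits: int, t: float) -> int:
    """Vertical distance between arrival and departure function at time t."""
    return (cumulative_bits(send_times, packet_bits, t)
            - cumulative_bits(recv_times, packet_bits, t))


def packet_delays(send_times: Sequence[float],
                  recv_times: Sequence[float]) -> List[float]:
    """Horizontal distance per packet (assumes no loss, in-order delivery)."""
    return [recv - send for send, recv in zip(send_times, recv_times)]


if __name__ == "__main__":
    # Five packets of 12000 bits, sent every 1 ms, received every 2 ms.
    send = [0, 1, 2, 3, 4]
    recv = [2, 4, 6, 8, 10]
    print(virtual_backlog(send, recv, 12000, 4))
    print(packet_delays(send, recv))
```

In this example the receive spacing exceeds the send spacing, so both backlog and delay grow over the duration of the probe.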

Representing the network by a min-plus linear system, we interpret a probing scheme as trying to determine from a specific sample of functions $A^p$ and $D^p$ an estimate of an unknown lower service curve $\underline{S}$, such that $D \ge A * \underline{S}$ holds for all pairs $(A, D)$ of arrival and departure functions. Ideally, the estimate should be a maximal $\underline{S}(t)$, i.e., there is no other lower service curve larger than $\underline{S}(t)$ that satisfies the definition.³ The goal of a probing scheme is to select a probing pattern, i.e., a function $A^p$, that reveals a maximal service curve. A maximal lower service curve $\underline{S}$ computed from $A^p$ and $D^p$ yields a sample of the available bandwidth.

³ We define a partial ordering of functions such that $f \le g$ iff $f(t) \le g(t)$ for all $t$.

Putting these considerations into a problem formulation, $\underline{S}$ is the solution to the following optimization problem:

\[ \begin{aligned} &\text{maximize} && \underline{S} \\ &\text{subject to} && D(t) \ge \inf_{\tau} \{ A(\tau) + \underline{S}(t - \tau) \} , \quad \forall t \ge 0 , \text{ for all pairs } (A, D) . \end{aligned} \]

This problem has the structure of a maximin optimization, a class of problems which is fundamentally hard. The formulation does not consider that service curves only form a partial ordering. Therefore, there may not be an optimal solution, but only solutions that cannot be further improved.

The bandwidth estimation problem is easier when the network can be described by a min-plus linear system. As we will see in Section VI, some non-linear networks, such as FIFO systems, are min-plus linear under low load conditions. Recalling that a system is min-plus linear if it can be described by an exact service curve, the bandwidth estimation problem is reduced to solving the inversion of

\[ D(t) = A * S(t) \quad \text{for all } t \ge 0 . \]

If we can take a measurement of $A^p$ and $D^p$ which solves the equation for $S$, then, due to min-plus linearity, we have a solution for all possible arrival and departure functions. From Section III we can infer that a solution is obtained by using the burst function of Eq. (2) as probing pattern, i.e., $A^p(t) = \delta(t)$. This follows since the service curve is the impulse response of a min-plus system, that is, $D^p(t) = \delta * S(t) = S(t)$. However, sending a probe as a burst function is not practical, since it assumes the instantaneous transmission of an infinite sized packet sequence. While a burst function can be approximated by a sufficiently large back-to-back packet train, a high-volume transmission of probes consumes network resources and interferes with other packet traffic. In fact, sending a burst function (or its approximation) may cause some networks that operate in a min-plus linear regime to become non-linear. The observation that large packet trains can lead to unreliable estimates has been noted in the literature [14].

In the next section, we present derivations for three bandwidth estimation methods in min-plus linear systems. We are 
able to relate two of these methods to previously proposed probing schemes. We will later discuss how these schemes 
can be applied to certain non-linear systems. 

We conclude this section with remarks on some general aspects of probing schemes and their representations in
min-plus linear system theory.

• Timestamps and asynchrony of clocks: When clocks at the sender and receiver of a probing packet are perfectly synchronized, and the sender includes the transmission time into each probing packet, the receiver can accurately construct the functions $A^p$ and $D^p$. In practice, however, clocks are not synchronized. When clocks have a fixed offset (but no drift), the arrival function $A^p$ can be viewed as being time-shifted by an unknown offset $T$. In the min-plus algebra a time-shift can be expressed by a convolution, i.e., $A^p(t - T) = A^p * \delta_T(t)$ where $\delta_T(t) = \delta(t - T)$. Here, the convolution of arrival function and service curve becomes $(A^p * \delta_T) * \underline{S}$, which, due to associativity and commutativity of the convolution operation, can be rewritten as $A^p * (\underline{S} * \delta_T)$. Hence, when the offset is fixed but unknown, even an ideal probing scheme can only compute a service curve that is a time-shifted version of the actual service curve of the network. Drifting clocks make the problem harder. Many bandwidth estimation schemes circumvent the problem of asynchronous clocks by returning probes to the sender [4], [7], or by only recording time differences of incoming probes, e.g., [20], [35], [40], [45]. A moment's consideration shows that knowledge of the differences between the transmission and arrival of probing packets has the same limitations as dealing with an unknown clock offset $T$ between the sender and receiver of probing packets.

• Losses: Probe packets that are dropped in the network can be thought of as incurring an infinite delay. The presentation of arrival and departure functions in Fig. 2 is not well suited for accommodating packet losses. An alternative presentation, which expresses arrival and departure times of probe packets (on the y-axis) as a function of the sequence numbers (on the x-axis), can deal with packet losses more elegantly, but may appear less intuitive. Such a description of traffic with flipped axes leads to a dual representation of the network calculus which is based on a max-plus algebra [9], [29].



• Packet pairs: The arrival and departure functions of a packet pair have each only three points, i.e., the origin and the two timestamps related to the packet pair. If it can be assumed that the service curve has a certain shape, e.g., a rate-latency curve $S(t) = r \cdot (t - d)_+$, the service curve can be recovered. In the absence of such an assumption, packet pair methods may not be able to recover more complex service curves. This is reflected in observations that bandwidth estimates from packet pairs tend to be less reliable compared to packet trains if cross-traffic is bursty [21], [32].
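The remark on clock offsets can be made concrete with a short discrete-time sketch. It assumes integer time slots, arrival functions that start at zero, and a known integer offset for the purpose of the check; the helper names are hypothetical. The code verifies that convolving with the shifted burst function delays a function by the offset, and that the offset can be moved from the arrivals to the service curve.

```python
import math
from typing import List, Sequence


def minplus_conv(f: Sequence[float], g: Sequence[float]) -> List[float]:
    """(f * g)[t] = min over 0 <= s <= t of f[s] + g[t - s]."""
    n = min(len(f), len(g))
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(n)]


def shifted_burst(offset: int, n: int) -> List[float]:
    """Sampled delta_T: 0 for t <= offset and infinity afterwards."""
    return [0.0 if t <= offset else math.inf for t in range(n)]


if __name__ == "__main__":
    n = 8
    arrivals = [0, 2, 4, 6, 8, 10, 12, 14]
    service = [0, 0, 3, 6, 9, 12, 15, 18]
    delta_t = shifted_burst(2, n)

    # Time shift as a convolution: A * delta_T (t) = A(t - T).
    print(minplus_conv(arrivals, delta_t))

    # The shift can be attributed to the service curve instead.
    left = minplus_conv(minplus_conv(arrivals, delta_t), service)
    right = minplus_conv(arrivals, minplus_conv(service, delta_t))
    print(left == right)
```

This is why a scheme that observes only shifted arrivals cannot separate the clock offset from a latency term of the service curve.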

V. Min-Plus Theory of Network Probing Methods 

In this section, we derive bandwidth estimation methods as solutions to finding an unknown service curve for a min-plus system. For the derivations, we make a number of idealizing assumptions. First, we consider a fluid flow view of traffic and service. This assumption can be relaxed at the cost of additional notation. Unless stated otherwise, we assume that the network represents a min-plus linear system. This assumption will be relaxed in Section VI. We generally assume that accurate timestamps for transmission and arrival of probes are feasible. If measurements only record time differences between events or include an unknown clock offset between sender and receiver, the computed service curves need to be time shifted by some constant value.

A. Passive Measurements 

We first try to answer the question: How much information about the available bandwidth can be extracted from passive measurements of traffic? To provide an answer we first introduce the deconvolution operator of the min-plus algebra, which is defined for two functions $f$ and $g$ by

\[ f \oslash g(t) = \sup_{\tau} \{ f(t + \tau) - g(\tau) \} . \]

The deconvolution operation is not an inverse to the convolution (in general, $(f * g) \oslash g \ne f$); however, it has aspects of such an inverse. This is expressed in the following duality statement from [29], which states that for functions $f$, $g$ and $h$, the following equivalency holds:

\[ f \le g * h \iff h \ge f \oslash g . \tag{8} \]

We will exploit this property to formulate the following lemma.

Lemma 1: For two functions $g$ and $h$, we have⁴

\[ ((h * g) \oslash g) * g = h * g . \]

Proof: The proof makes two applications of Eq. (8). Let us define $\tilde{h} = f \oslash g$ and $f = g * h$. By definition of $\tilde{h}$ we can conclude with Eq. (8) that $f \le g * \tilde{h}$.

By definition of $f$, we see from Eq. (8) that $h \ge f \oslash g$. By our definition of $\tilde{h}$, this gives us $h \ge \tilde{h}$. From $h \ge \tilde{h}$ and $f = g * h$ we get $f \ge g * \tilde{h}$.

Combining the two statements about the relationship of $f$ and $g * \tilde{h}$ gives us $f = g * \tilde{h}$. Now, by inserting our definition $\tilde{h} = f \oslash g$, we obtain $f = g * (f \oslash g)$. Inserting our second definition $f = g * h$ yields $g * h = g * ((g * h) \oslash g)$. Reordering the expression using commutativity of the min-plus convolution completes the proof. ■

The lemma justifies the following passive measurement scheme. Let us denote the arrival and departure functions measured from a traffic trace of one or more flows by $A^p$ and $D^p$. By assumption of linearity, we know that $D^p = A^p * S$ holds, but the shape of $S$ is unknown. Suppose we compute a function $\tilde{S}$ from the trace as the deconvolution of the departures and the arrivals, i.e., we set

$\tilde{S} = D^p \oslash A^p$. (9)

With this, we can derive as follows:

$D^p = S * A^p$
$= ((S * A^p) \oslash A^p) * A^p$
$= (D^p \oslash A^p) * A^p$
$= \tilde{S} * A^p$

*We use the shorthand notation $f = g * h$ to mean '$f(t) = (g * h)(t)$ for all $t \ge 0$'.
TABLE I
Example 1: Parameters of On-Off sources.

(a) High load

Burstiness                  high   med   low
Number of sources              1     5    25
Source peak rate [Mbps]      200    40     8
Total average rate [Mbps]     20    20    20

(b) Low load

Burstiness                  high   med   low
Number of sources              1     5    25
Source peak rate [Mbps]      200    40     8
Total average rate [Mbps]     10    10    10


Equality in the first line holds because of our assumption of linearity. In the second line we apply Lemma 1. The third line uses again the linearity assumption. In the fourth line, we insert Eq. (9). We can therefore conclude with Lemma 1 that

$D^p = A^p * \tilde{S}$. (10)

Applying the duality property from Eq. (8) to $D^p = A^p * S$, we obtain $S \ge D^p \oslash A^p$. Then, with Eq. (9) we have

$\tilde{S} \le S$.

Hence, by deconvolving $D^p$ and $A^p$ as in Eq. (9), the result $\tilde{S}$ is a lower service curve, i.e., for all pairs of arrival and departure functions $(A, D)$, we have $D \ge A * \tilde{S}$. Since, from Eq. (10), $\tilde{S}$ can completely reconstruct the departure function from the arrival function, we can conclude that $\tilde{S}$ is the best possible estimate of the actual service curve that can be justified from measurements of $A^p$ and $D^p$, in the sense that it extracts the most information from the measurements. Since the above deconvolution computes the largest available bandwidth that can be justified from a given traffic trace, the described method will perform no worse than any existing method from the MBAC literature.

The main drawback of this method is that it can only be applied to linear networks. For networks that do not satisfy min-plus linearity, i.e., that can only be described by a lower service curve ($D \ge A * \underline{S}$) or upper service curve ($D \le A * \overline{S}$), $\tilde{S}$ only computes a (not useful) lower bound for an upper service curve $\overline{S}$. As another remark, note that Lemma 1 does not help us with designing a probing scheme, since it does not tell us how to select the traffic $A^p$ for the network probes.

For illustration of the passive measurement scheme, we now present two numerical examples.

Example 1: Sensitivity of Passive Measurements. We study the sensitivity of the passive measurement method with respect to the burstiness of the trace, the fraction of available bandwidth that is utilized by the flows, and the length of the measurement period. We consider an idealized fluid flow traffic at a min-plus linear system, which is governed by a service curve

$S(t) = (b + rt) * (R[t - T]_+)$.

The system represents a network where the input is regulated with a leaky-bucket with parameters $b$ and $r$, and the service is described by a latency-rate service curve with delay $T$ and rate $R$. We set $b = 0.75$ Mb, $r = 25$ Mbps, $R = 100$ Mbps, and $T = 10$ ms.

As traffic trace, we use an arrival sample path that represents the aggregate arrivals from a set of statistically independent On-Off traffic sources. In the On state, each source generates traffic at a given peak rate. In the Off state, no data is generated. In each time slot of duration one millisecond, a source switches from the On state to the Off state with probability $p$, and from the Off state to the On state with probability $q$.

The parameters are listed in Table I. In the high load setting, we set $p = 0.09$ and $q = 0.01$, resulting in a total arrival rate of 20 Mbps. In low load, we set $p = 0.19$ and $q = 0.01$, which leads to an average total traffic rate of 10 Mbps. We control the burstiness of the traffic by increasing the number of flows, and accordingly decreasing the peak rate of each flow. Due to statistical multiplexing, an aggregate of multiple On-Off sources is less bursty than a single flow with the same peak and average rate. In our plots burstiness levels of high, medium, and low correspond to a trace with 1, 5, and 25 sources.

Fig. 3. Example 1: Passive measurement of multiplexed On-Off traffic. Panels: (a) High load, after 1 second. (b) High load, after 10 seconds. (c) Low load, after 1 second. (d) Low load, after 10 seconds. Horizontal axis: time [ms].

In Figs. 3(a)-3(d) we show the estimates of the lower service curves $\tilde{S}$ obtained with the deconvolution described above, and compare them to the actual service curve $S$, indicated as a thick (red) line in each graph. The measurement is taken over 1 second (plots on the left) or 10 seconds (plots on the right). In all plots, we see that burstier traffic leads to better estimates of the service curve. This is expected since we know that the burstiest traffic, i.e., a burst impulse, can perfectly recover $S$ (see Section IV). For the same reason, the estimates improve when the traffic trace has a higher utilization of the available bandwidth. Observe that all estimates improve with increasing length of the evaluation period. This follows from the definition of the supremum in the min-plus deconvolution operation: a longer trace enlarges the set over which the supremum is taken, so the estimate can only grow.
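
The passive estimate of Eq. (9) can be tried out on a discrete time grid. The following Python sketch is only an illustration under our own assumptions: time is slotted in 1 ms steps, the leaky-bucket term is taken to be 0 at $t = 0$, the On-Off sources start in their stationary state, and all function names are hypothetical. It generates an On-Off trace with the high-load, high-burstiness parameters of Table I, passes it through the linear system, and compares the deconvolution estimate with the true service curve.

```python
import random

def minplus_conv(f, g):
    """Min-plus convolution on a common grid: (f * g)[t] = min_s f[s] + g[t - s]."""
    return [min(f[s] + g[t - s] for s in range(t + 1)) for t in range(len(f))]

def minplus_deconv(f, g, horizon):
    """Min-plus deconvolution, with the supremum limited to the trace length."""
    n = len(f)
    return [max(f[t + s] - g[s] for s in range(n - t)) for t in range(horizon)]

def service_curve(n, b=0.75, r=25.0, R=100.0, T=10):
    """S = (b + r t) * (R [t - T]_+) in Mb on a 1 ms grid (rates in Mbps)."""
    bucket = [0.0] + [b + r * t / 1000.0 for t in range(1, n)]  # assumed 0 at t = 0
    latency_rate = [R * max(t - T, 0) / 1000.0 for t in range(n)]
    return minplus_conv(bucket, latency_rate)

def on_off_arrivals(n_slots, n_sources, peak_mbps, p, q, rng):
    """Cumulative arrivals in Mb of independent On-Off sources, one slot = 1 ms."""
    on = [rng.random() < q / (p + q) for _ in range(n_sources)]  # stationary start
    cumulative = [0.0]
    for _ in range(n_slots):
        cumulative.append(cumulative[-1] + sum(on) * peak_mbps / 1000.0)
        # On -> Off with probability p, Off -> On with probability q
        on = [(rng.random() >= p) if state else (rng.random() < q) for state in on]
    return cumulative

rng = random.Random(1)
n = 1001                                   # one second of 1 ms slots
S = service_curve(n)
A = on_off_arrivals(n - 1, n_sources=1, peak_mbps=200.0, p=0.09, q=0.01, rng=rng)
D = minplus_conv(A, S)                     # departures of the linear system
S_est = minplus_deconv(D, A, horizon=100)  # Eq. (9) on the first 100 ms
worst_gap = max(s - e for s, e in zip(S, S_est))
print(f"largest underestimate over 100 ms: {worst_gap:.3f} Mb")
```

By construction the estimate never exceeds the true curve, and an arrival function that is a burst impulse reproduces $S$ exactly, which is consistent with the discussion above.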

Example 2: The Dilemma of Passive Measurements. To illustrate the limitations of passive measurements for bandwidth estimation, we now present as a second example an ns-2 simulation [1] of measurements at a single node with capacity $C$. There is a propagation delay of 10 ms at the ingress link and a 10 ms delay at the egress link. The packet scheduling algorithm is either FIFO or Deficit Round Robin (DRR). DRR approximates a fair queuing discipline, which can distribute capacity equally among cross and probe traffic. The cross traffic at this link consists of CBR traffic which is transmitted in 800 byte packets. The rate of cross traffic is set to half the link capacity. The traffic source for passive measurements is a small segment of a high-bandwidth variable bit rate video trace [12] with an average rate of 17.1 Mbps and a peak rate of 154 Mbps. (We have used two seconds of the video trace entitled From Mars to China.) We evaluate the bandwidth estimation when the link capacity is set to $C = 70$, 50, and 30 Mbps. The resulting service curves are shown in Fig. 4. In each figure, the exact service curve (red line) is a latency-rate service curve with delay 20 ms and rate $C/2$. The computed estimates are indicated by a dashed line for FIFO and a solid line for DRR scheduling. For $C = 70$ Mbps, the available bandwidth is clearly underestimated. The estimates improve for $C = 50$ Mbps, where the video trace accounts for a larger fraction of the unused bandwidth. For $C = 30$ Mbps, the available bandwidth is estimated with high accuracy for the DRR link, but overestimated for the FIFO link. The overly optimistic estimates at a FIFO link occur when the variable bit rate of the video traffic overloads the link, thereby preempting cross traffic.

An explanation for this outcome is given in Section VI, where we discuss non-linearities observed in overloaded FIFO systems. The video trace example indicates a fundamental dilemma with passive measurements. On the one hand, if the traffic intensity of the measured trace is too low, the trace does not extract enough information from the network. On the other hand, if the traffic intensity is too high, the traffic trace may preempt other traffic, thus leading to inaccurate estimates.

B. Rate Scanning 

We now consider an active probing scheme that transmits packet trains at a constant rate, but varies the rate of subsequent trains, as done, e.g., by pathload [19], [20]. We provide a justification for this approach, which we refer to as rate scanning, using min-plus system theory.

Given arrival and departure functions $A$ and $D$, using the earlier definition of backlog, the maximum backlog can be computed as

$B_{\max} = \sup_t \{ A(t) - D(t) \}$.

If the arrivals are a constant rate function, that is, $A(t) = rt$, and the network satisfies min-plus linearity, we can write $B_{\max}$ as a function of $r$ as follows:

$B_{\max}(r) = \sup_t \{ rt - \inf_\tau \{ r\tau + S(t - \tau) \} \}$
$= \sup_t \{ \sup_\tau \{ r(t - \tau) - S(t - \tau) \} \}$
$= \sup_t \{ rt - S(t) \}$.

The first line uses that the output of a min-plus linear system can be characterized by $D = A * S$. The second line moves the infimum in front of the subtraction, where it becomes a supremum. The third line is simply a substitution.



Recalling the definition of the Legendre transform from Subsection III-B, the right hand side of the last equation can be written as the Legendre transform of $S$, that is, $B_{\max}(r) = \mathcal{L}_S(r)$. This relation has been observed in [W^, [15], [37]. We now take a further step by applying the relation in the reverse transform. Due to Eq. (5), we have for convex service curves $S$ that

$S(t) = \mathcal{L}(\mathcal{L}_S)(t) = \mathcal{L}_{B_{\max}}(t) = \sup_r \{ rt - B_{\max}(r) \}$.

Thus, every convex service curve can be completely recovered by measurements of the maximum backlog $B_{\max}$. For service curves that are not convex one recovers, using Eq. (6), a lower bound for the service curve. The interpretation of rate scanning is that each constant bit rate stream with rate $r$ reveals one point $B_{\max}(r)$ of the service curve in the Legendre domain $\mathcal{L}_S(r)$. If we specify a rate increment, which sets the rate increase between packet trains, and a rate limit, which sets the maximum rate at which the network is scanned, we realize a rate scanning method that computes a service curve consisting of piecewise linear segments. The choice of the rate increment determines the length of the segments, and, in this way, the accuracy of the computed service curve. We note that rate scanning is capable of tracking a convex service curve up to a time where the derivative of the service curve reaches the rate limit. The higher the maximum rate, the more information about the service curve is recovered. The number of packets in a packet train must be large enough so that the maximum backlog can be accurately measured.
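
As a small worked example of this transform pair, consider a rate-latency service curve with rate $R$ and delay $T$. This choice is ours and serves only as an illustration; it is convex, so the reverse transform should return the curve itself.

```latex
% Illustration only: rate-latency service curve (our assumption for this example)
S(t) = R\,[t - T]_+
\quad\Longrightarrow\quad
B_{\max}(r) = \sup_{t \ge 0}\{\, rt - R\,[t - T]_+ \,\}
            = \begin{cases} rT, & 0 \le r \le R \\ \infty, & r > R \end{cases}

% Reverse transform: only rates r <= R contribute a finite backlog
\sup_{0 \le r \le R}\{\, rt - B_{\max}(r) \,\}
  = \sup_{0 \le r \le R}\{\, r\,(t - T) \,\}
  = R\,[t - T]_+ = S(t)
```

For $t \le T$ the supremum is attained at $r = 0$, and for $t > T$ at $r = R$, so scanning up to the rate $R$ is sufficient to recover this curve.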
Fig. 4. Example 2: Passive measurement simulation with a video source. Panel (c): C = 30 Mbps.



A criterion for picking the rate limit suggested by our derivations is to stop rate scanning when increasing the scanning rate does not yield an improvement of the service curve. This criterion, however, may fail when the underlying network is not min-plus linear. The rate scanning method pathload [19], [20] uses an iterative procedure which varies the rate $r$ of consecutive packet trains until measured delays indicate an increasing trend. In Section VI we will find that similar criteria can be justified to determine a rate limit in a non-linear system.



In Fig. 5(a) we present an example of the rate scanning approach for a fluid-flow service curve with a quadratic form $S(t) = 0.4 t^2$. In the example, rate scanning is performed at rates 10, 20, ..., 80 Mbps. In Fig. 5(a) we plot the maximum backlog observed for each scanning rate. The function $B_{\max}(r)$ is constructed by connecting the measured data points by lines. For rates $r$ exceeding the rate limit we can set $B_{\max}(r) = \infty$ to obtain a conservative Legendre transform for all rate values. In Fig. 5(b) we show the service curves that are obtained with different rate limits. The higher the rate limit, the more accurate the results. Decreasing the increment of the rate will improve the accuracy of the service curve. We point out that both the backlog plot in Fig. 5(a) and the service curves in Fig. 5(b) consist of linear segments.
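
The rate scanning computation for the quadratic example can be sketched in a few lines of Python. The sketch is a hedged illustration rather than the code used for Fig. 5: the units (time in ms, data in kb, so that rates are in Mbps), the time grid, the added point $B_{\max}(0) = 0$, and all function names are our assumptions.

```python
def service(t):
    """Quadratic service curve of the example; units are an assumption (kb, ms)."""
    return 0.4 * t * t

def max_backlog(rate, times):
    """B_max(r) = sup_t { r t - S(t) }, evaluated on a finite time grid."""
    return max(rate * t - service(t) for t in times)

def scan(rates, times):
    """One maximum-backlog value per constant-rate packet train."""
    return [(r, max_backlog(r, times)) for r in rates]

def service_estimate(points, t):
    """Inverse Legendre transform restricted to the probed rates.

    With B_max interpolated linearly between probed rates and set to infinity
    above the rate limit, r t - B_max(r) is piecewise linear in r, so the
    supremum is attained at a probed rate (or at r = 0 with zero backlog).
    """
    return max([0.0] + [r * t - b for r, b in points])

times = [0.5 * k for k in range(401)]          # 0 .. 200 ms
points = scan(range(10, 90, 10), times)        # 10, 20, ..., 80 Mbps
for t in (12.5, 40.0, 100.0, 150.0):
    print(t, service_estimate(points, t), service(t))
```

Each probed rate contributes one tangent line to the estimate, which touches the true curve where its slope equals that rate. Beyond the point where the slope of $S$ reaches the rate limit, the estimate continues linearly and falls below the true curve.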
C. Rate Chirps

The need of rate scanning to measure a possibly large number of packet trains has motivated the pathchirp method [40], where available bandwidth estimates are based on the measurement of a single packet train with a geometrically decreasing inter-packet spacing. The approach takes inspiration from chirp signals in signal processing, which are signals whose frequencies change with time. We refer to this approach as rate chirp, since the decreased gap between packets corresponds to an increase of the transmission rate. We will show that a rate chirp scheme can be justified in min-plus system theory using properties of the Legendre transform.

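Suppose we have a lower service curve $\underline{S}$ satisfying $D \ge A * \underline{S}$ for all pairs $(A, D)$. Taking the Legendre transform, we obtain with the order reversing property of Eq. (7), and with the property that the Legendre transform turns a min-plus convolution into a sum, that

$\mathcal{L}_D \le \mathcal{L}_{A * \underline{S}} = \mathcal{L}_A + \mathcal{L}_{\underline{S}}$.

We can re-write this as

$\mathcal{L}_{\underline{S}} \ge \mathcal{L}_D - \mathcal{L}_A$,

as long as the difference $\mathcal{L}_D(r) - \mathcal{L}_A(r)$ is defined for all $r$. A sufficient condition is that $\mathcal{L}_A(r) < \infty$, since it prevents both transforms $\mathcal{L}_D$ and $\mathcal{L}_A$ from becoming infinite at the same value of $r$. Another application of Eq. (7) yields

$\mathcal{L}(\mathcal{L}_{\underline{S}}) \le \mathcal{L}(\mathcal{L}_D - \mathcal{L}_A)$.

If the system is min-plus linear, that is, $D = A * S$, we get

$\mathcal{L}(\mathcal{L}_S) = \mathcal{L}(\mathcal{L}_D - \mathcal{L}_A)$.

If $S$ is also convex, then by Eq. (5), we have $S = \mathcal{L}(\mathcal{L}_D - \mathcal{L}_A)$.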

This provides us with a justification for pathchirp [40] as a probing method. If we depict the transmission of a packet chirp as a fluid flow function, we see that it grows to an infinite rate, thus yielding a Legendre transform that is finite for all rates. By measuring arrivals and departures of the chirp, denoted by $A^{chrp}$ and $D^{chrp}$, we compute a function $\tilde{S}$ by

$\tilde{S}(t) = \mathcal{L}(\mathcal{L}_{D^{chrp}} - \mathcal{L}_{A^{chrp}})(t)$. (11)

If the network satisfies $D = A * S$ for all arrivals, then the right hand side of Eq. (11) computes $\mathcal{L}(\mathcal{L}_S)$. With Eq. (6), we obtain $\tilde{S} \le S$, which tells us that $\tilde{S}$ is a lower service curve that satisfies $D \ge A * \tilde{S}$ for any traffic with arrival function $A$ and departure function $D$. If $S$ is convex we have $\tilde{S} = S$, and we can recover the service curve exactly.

In practical probing schemes, our fluid flow interpretation where a packet chirp can grow to an infinite rate is idealized, since a rate chirp cannot be transmitted faster than the data rate at the sender of probe packets. For a packet chirp that is transmitted in a time interval $[0, t^A_{\max}]$ and where $D$ is observed over an interval $[0, t^D_{\max}]$, the following adjustment complies with the formal requirements of our equations:

$A^{chrp}(t) = \begin{cases} A^{chrp}(t), & \text{if } t \le t^A_{\max} \\ \infty, & \text{if } t > t^A_{\max} \end{cases}$

$D^{chrp}(t) = \begin{cases} D^{chrp}(t), & \text{if } t \le t^D_{\max} \\ D^{chrp}(t^D_{\max}) + (t - t^D_{\max}) \frac{d}{dt} D^{chrp}(t^D_{\max}), & \text{if } t > t^D_{\max} \end{cases}$

The arrival function is simply set to $\infty$ past the last measurement. The departure function is continued at a rate that corresponds to its slope at the time of the last measurement. For convex service curves $S$, the above extensions are conservative.



In Fig. 6(a) we show several rate chirps for a network probe. The rate chirp consists of a step-function which emulates a sequence of probing packets of 1200 bytes. The packets are transmitted at an increasing rate, starting at 10 Mbps and growing to 200 Mbps. The rate is increased by reducing the elapsed time between the transmission of the first bit of two consecutive packets by a constant factor $\gamma$, which is called the spread factor in [40]. Larger values for $\gamma$ lead to shorter chirps that grow faster to the maximum rate. In Fig. 6(b) we show the service curves computed from the chirps in Fig. 6(a). The actual service curve is $S(t) = 0.4 t^2$, indicated as a thick (red) line in the figure. A chirp with a smaller spread factor $\gamma$, which transmits more packets over a longer time interval, leads to better estimates of the service curve.
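
The effect of the spread factor on the length of a chirp can be made concrete with a short Python sketch of the transmission schedule. It is an illustration under our own conventions: the function names are hypothetical, and we assume that gaps are generated as long as the instantaneous rate stays below the maximum rate, so the exact packet count depends on how the first and last packet are counted.

```python
def chirp_gaps(packet_bytes, start_mbps, max_mbps, spread):
    """Inter-packet gaps in ms; each gap is the previous one divided by `spread`."""
    bits = 8 * packet_bytes
    gaps = []
    rate = start_mbps
    while rate < max_mbps:
        gaps.append(bits / (rate * 1000.0))   # 1 Mbps = 1000 bits per ms
        rate *= spread
    return gaps

def send_times(gaps):
    """Transmission times (ms) of the first bit of each packet."""
    times = [0.0]
    for gap in gaps:
        times.append(times[-1] + gap)
    return times

for spread in (1.2, 1.5, 2.0):
    gaps = chirp_gaps(1200, 10.0, 200.0, spread)
    times = send_times(gaps)
    print(f"spread {spread}: {len(times)} packets over {times[-1]:.2f} ms")
```

A smaller spread factor yields more gaps before the maximum rate is reached, and therefore more sample points of the arrival and departure functions.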

VI. Bandwidth Estimation in Non-Linear Systems

Extending bandwidth estimation to systems that are not min-plus linear, i.e., cannot be described by an exact service curve, raises difficult questions. First, the problem formulation of bandwidth estimation at the beginning of Section IV has shown that the problem has the structure of a maximin optimization. Moreover, in networks with non-linearities the network service available to a traffic flow may depend on the traffic transmitted by this flow. If this is the case, knowledge of the available bandwidth may not help with predicting network behavior.

In this section, we provide solutions for a class of networks that can be decomposed into disjoint min-plus linear and
non-linear regions. These networks behave like a min-plus linear system at low load, and become non-linear when the 
traffic rate is increased beyond a threshold. In such a network, the goal of bandwidth estimation should be to determine 
the available bandwidth of the linear region. The interpretation is that the available bandwidth denotes the maximum 
additional load that the network can carry without degrading to a non-linear system. Our work is motivated by studying 
the available bandwidth at a FIFO link. While we conjecture that most networks can be adequately described by a 
system that behaves linearly at low loads, the actual scope of this class of networks remains an open problem. 

A. Non-linearity of FIFO systems 



Consider the FIFO system shown in Fig. 7(a) with capacity $C$. Assume that we have constant-bit rate traffic that is transmitted in 800 byte packets. The FIFO queue experiences (cross) traffic at a rate of $r_c$, and probing traffic is sent according to $A(t) = rt$. Assuming a link capacity of $C = 50$ Mbps and cross traffic of $r_c = 25$ Mbps, we consider probing rates of $r = 25$, 50, 75, and 100 Mbps. For an ns-2 simulation of this system, Fig. 7(b) depicts the departure function of the probe packets for the range of probing rates. As seen previously for passive measurements at a FIFO queue (see Fig. 4), once the probing traffic exceeds the unused capacity, it preempts cross traffic and results in an overly optimistic estimate of the available bandwidth. Empirical observations of FIFO systems with CBR cross and probe traffic in [36] suggested the following departure function:

$D(t) = \begin{cases} rt, & \text{if } r \le C - r_c \\ \frac{r}{r + r_c} C t, & \text{if } r > C - r_c \end{cases}$ (12)

Thus, if the probing rate is above the threshold $C - r_c$, the capacity allocated to the probe and cross traffic is proportional to their respective rates. As a result, probing traffic gets more bandwidth when its rate is increased.

We now offer a min-plus system interpretation of bandwidth estimation for the depicted FIFO scenario. Consider the function $S_{fifo}(t) = (C - r_c) t$. From the empirical departure characterization $D$ of a FIFO system from Eq. (12), we can verify that the following is satisfied for all $t \ge 0$:

$D(t) = (rt) * S_{fifo}$, if $r \le C - r_c$
$D(t) \ge (rt) * S_{fifo}$, if $r > C - r_c$ (13)

Therefore, $S_{fifo}$ is an exact service curve for $A(t) = rt$ when $r \le C - r_c$, and $S_{fifo}$ is a lower service curve when the arrivals exceed the threshold value. In fact, $S_{fifo}$ is the largest lower service curve for a FIFO system, and a solution to the maximization in Section IV. Any function larger than $S_{fifo}$ may not be a lower service curve for rates $r > C - r_c$, indicating that a FIFO system is not min-plus linear in this range.

These considerations suggest viewing a FIFO network as a system that is min-plus linear at rates $r \le C - r_c$, and crosses into a non-linear region when the rate exceeds the threshold. The crossing of these regions coincides with the point where the available bandwidth $S_{fifo}$ can be observed.

Probing schemes that vary the rate of probe traffic can sometimes be interpreted in terms of searching for the crossover from a linear to a non-linear regime. In particular, the rules in pathload and pathchirp to stop measurements when increasing delays are observed can be justified in terms of crossing into the non-linear region (at least in a FIFO system), since the probing rate $C - r_c$ is the turning point above which the buffer of the FIFO system fills up. In the remainder of this section, we address the problem of locating this crossover point using systems-theoretic arguments.
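
The relation between the empirical FIFO departure function and $S_{fifo}$ can be checked numerically. The following Python sketch is a minimal illustration of the fluid model only, with hypothetical function names. It uses the fact that the min-plus convolution of two linear functions through the origin is the linear function with the smaller slope.

```python
def fifo_departure_rate(r, capacity, cross):
    """Slope of the empirical FIFO departure function for probe rate r (Mbps)."""
    if r <= capacity - cross:
        return r
    return capacity * r / (r + cross)

def convolved_rate(r, capacity, cross):
    """Slope of (r t) * S_fifo with S_fifo(t) = (C - r_c) t, i.e. the smaller slope."""
    return min(r, capacity - cross)

C, rc = 50.0, 25.0
for r in (25.0, 50.0, 75.0, 100.0):
    print(r, fifo_departure_rate(r, C, rc), convolved_rate(r, C, rc))
```

For probing rates up to $C - r_c$ the two slopes coincide, while above the threshold the departure rate exceeds $C - r_c$ and keeps growing with the probing rate.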

B. Stopping Criteria 

We address the problem of determining the threshold probing rate for a system with disjoint linear and non-linear 
regions. The threshold probing rate can be interpreted as the maximum rate at which the network can be probed without 
leaving the linear region. We refer to a condition that determines the maximum probing rate as a stopping criterion. 

Non-linearity Criterion: In a min-plus linear system, the service curve is independent of the traffic intensity of the probe traffic. If we have obtained, under the assumption of min-plus linearity, a lower service curve $S$ from a measurement probe with functions $(A, D)$, then $S$ must be a lower service curve for any other arbitrary measurement probe $(A', D')$, that is, $D'(t) \ge A' * S(t)$ for all times $t$. A violation of the inequality indicates that the assumption of linearity used for the computation of $S$ is false.

A simple non-linearity test can be devised for systems where increased traffic does not result in decreased output. Consider a sequence of probes $(A_i, D_i)_{i=1,2,\ldots,n}$, where the traffic intensity of subsequent probes is increased, that is, $A_{i+1} \ge A_i$. By assumption, we also have $D_{i+1} \ge D_i$. Each probe results in an estimate $S_k(t)$. If there is a $k$ for which $S_k$ violates linearity for some $i < k$, that is, $D_i(t) < A_i * S_k(t)$ for some $t$, then the network is no longer in the linear region for the probe $(A_k, D_k)$.

The described criterion can be directly applied to a rate scanning approach with increasing probing rates where $A_i(t) = r_i t$ with $r_{i+1} > r_i$. As an alternative, one could modify the scanning rate to perform a search for the maximum scanning rate in the linear region. This makes the criterion more similar to the scanning pattern in pathload.
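
For constant-rate fluid probes the non-linearity test reduces to a comparison of slopes, which the following Python sketch illustrates. It rests on several assumptions of ours: each estimate $S_k$ is taken to be the deconvolution of the departures and arrivals of probe $k$, which for linear functions with departure rate $d_k \le r_k$ is the line with slope $d_k$; the departure rates come from the toy FIFO model of Section VI-A with $C = 50$ Mbps and $r_c = 25$ Mbps; and the function names are hypothetical.

```python
def fifo_departure_rate(r, capacity, cross):
    """Toy fluid FIFO model following Eq. (12); rates in Mbps."""
    return r if r <= capacity - cross else capacity * r / (r + cross)

def first_violation(probes, tol=1e-9):
    """probes: list of (arrival rate r_i, departure rate d_i), increasing in r_i.

    Returns the index k of the first probe whose estimate S_k is contradicted
    by an earlier probe, or None if no contradiction is found.
    """
    for k, (_, d_k) in enumerate(probes):
        s_k = d_k   # slope of S_k for linear fluid probes with d_k <= r_k
        for r_i, d_i in probes[:k]:
            # (A_i * S_k)(t) = min(r_i, s_k) t, so compare slopes
            if d_i < min(r_i, s_k) - tol:
                return k
    return None

rates = [float(r) for r in range(5, 65, 5)]
probes = [(r, fifo_departure_rate(r, 50.0, 25.0)) for r in rates]
k = first_violation(probes)
print("linearity contradicted at probe rate", None if k is None else probes[k][0])
```

In this toy model the test fires at the second probe above the threshold $C - r_c$, because a contradiction requires an earlier probe that was itself taken in the non-linear region. A finer rate increment therefore narrows the gap between the threshold and the detected rate.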

Applying the non-linearity criterion to a rate chirp approach is less straightforward, since there is only a single arrival function $A^{chrp}$. Generating multiple arrival functions from a single rate chirp by truncating the arrival functions merely produces truncated versions of the same service curve. Transmitting multiple rate chirps with different spread factors (see Fig. 6(a)) makes the criterion applicable, yet it loses the main advantage of rate chirps of requiring only a single packet train. Thus, with only a single packet train, we are unable to justify a stopping criterion from min-plus linear systems theory.

Backlog Convexity Criterion: This method is applicable to rate scanning methods, with probing rates $r \in [r_1, r_2, \ldots, r_n]$ with $r_{i+1} > r_i$. Assume that the maximum backlog measurement is $B_{\max}(r)$ for rate $r$. Recall from Subsection V-B that a linear system satisfies $S(t) = \mathcal{L}_{B_{\max}}(t)$ and that $B_{\max}(r) = \mathcal{L}_S(r)$ holds for all $r$. This motivates a test for linearity that exploits properties of the Legendre transform discussed in Subsection III-B. Under the assumption of linearity, an estimate of the service is obtained from

$\tilde{S}(t) = \mathcal{L}_{B_{\max}}(t)$,

and $B_{\max}(r) = \mathcal{L}_{\tilde{S}}(r)$ holds for all $r$. Using Eq. (5) and Eq. (6), if there exists an $r$ such that $B_{\max}(r)$ is not convex, i.e.,

$B_{\max}(r) \ne \mathrm{conv}_{B_{\max}}(r)$

for some $r$, we have $B_{\max}(r) \ne \mathcal{L}_{\tilde{S}}(r)$, and, hence, the hypothesis of a linear system is dismissed.

For systems that are linear at low probing rates and cross into a non-linear region after a threshold is reached, a convexity test can be easily devised for schemes that incrementally increase the probing rate. After each rate step, one simply performs a test for equality of $B_{\max}(r)$ and $\mathcal{L}(\mathcal{L}_{B_{\max}})(r)$. If $B_{\max}(r) \ne \mathcal{L}(\mathcal{L}_{B_{\max}})(r)$, the system has reached the non-linear region and the rate scan is terminated. Otherwise, it is assumed that the system is still linear, and the probing rate is increased.
To avoid false positives and negatives in the test for equality, we suggest a heuristic that can account for variability in the measurements. For each value of $r$, we compute the difference

$\Delta B(r) = B_{\max}(r) - \mathcal{L}(\mathcal{L}_{B_{\max}})(r)$.

Note that $\Delta B(r)$ is never negative since $B_{\max}(r) \ge \mathrm{conv}_{B_{\max}}(r)$. When the normalized difference $\Delta B(r)/r$ exceeds a threshold value, we assume that the system is no longer in the linear region. Outliers that are due to random fluctuations can be eliminated by median filtering $\Delta B(r)/r$ before applying the threshold test. In our experiments, we use a threshold of $\alpha = 4$ ms and perform median filtering.
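
The heuristic can be sketched in Python as follows. This is an illustration under our own assumptions, not the implementation used in the experiments: the convex hull of the backlog is computed as the lower convex envelope of the measured points with linear interpolation between them, backlogs are given in kb and rates in Mbps so that the normalized difference is in ms, the median filter shortens its window at the edges, and all function names are hypothetical.

```python
from statistics import median

def lower_convex_envelope(xs, ys):
    """Values at xs of the largest convex function below the points (xs, ys)."""
    hull = []                                   # indices of lower-hull vertices
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross > 0:
                break
            hull.pop()                          # b lies on or above the chord a -> i
        hull.append(i)
    if len(hull) == 1:
        return list(ys)
    env = []
    for x in xs:
        for a, b in zip(hull, hull[1:]):        # find the hull segment containing x
            if xs[a] <= x <= xs[b]:
                w = (x - xs[a]) / (xs[b] - xs[a])
                env.append((1 - w) * ys[a] + w * ys[b])
                break
    return env

def normalized_gap(rates, backlogs):
    """Delta B(r) / r for every probed rate."""
    env = lower_convex_envelope(rates, backlogs)
    return [(b - e) / r for r, b, e in zip(rates, backlogs, env)]

def median_filter(values, window=3):
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]

def leaves_linear_region(rates, backlogs, threshold_ms=4.0, window=3):
    """True if the filtered normalized difference exceeds the threshold."""
    filtered = median_filter(normalized_gap(rates, backlogs), window)
    return any(v > threshold_ms for v in filtered)

rates = [1.0, 2.0, 3.0, 4.0, 5.0]
print(leaves_linear_region(rates, [r * r for r in rates]))            # convex backlog
print(leaves_linear_region(rates, [0.0, 0.0, 60.0, 62.0, 63.0]))      # concave tail
```

In an incremental scan, the test would be repeated after each rate step on the points measured so far.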

An additional issue is that, since packet trains have a certain length, the maximum backlog $B_{\max}$ may not be attained by a train. In order to apply the backlog convexity criterion to finite-length packet trains, it must be shown that the backlog that is created by fixed length packet trains also violates convexity once the boundary to the non-linear region is crossed. As an example, for FIFO systems, we obtain from Eq. (12) that the maximum backlog generated by a packet train of $L$ bits is

$B^L_{\max}(r) = \begin{cases} 0, & \text{if } r \le C - r_c \\ L \left( 1 - \frac{C}{r + r_c} \right), & \text{if } r > C - r_c \end{cases}$

For all $r > C - r_c$ the second derivative of $B^L_{\max}(r)$ is negative and thus $B^L_{\max}(r)$ is strictly concave, while it is convex for $r \le C - r_c$. Thus, the backlog convexity criterion can be applied for finite packet trains in this case.
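
The backlog of a finite train above the threshold can be derived in a few steps. The following is a sketch under the fluid FIFO model of Eq. (12), assuming that a train of $L$ bits sent at rate $r$ ends at time $L/r$ and that the backlog is largest at the end of the train.

```latex
% For r > C - r_c the backlog r t - D(t) grows with t, so it peaks at t = L / r:
B^{L}_{\max}(r) = L - \frac{r}{r + r_c}\, C \,\frac{L}{r}
               = L \left( 1 - \frac{C}{r + r_c} \right)

% First and second derivative with respect to the probing rate:
\frac{\partial B^{L}_{\max}}{\partial r} = \frac{L\,C}{(r + r_c)^2} > 0 ,
\qquad
\frac{\partial^2 B^{L}_{\max}}{\partial r^2} = -\frac{2\,L\,C}{(r + r_c)^3} < 0
```

At $r = C - r_c$ the expression evaluates to zero, so the backlog is continuous at the threshold.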

VII. Experimental Validation 

In this section, we present measurement experiments on an IP network that provide an empirical evaluation of the proposed system theoretic approach to bandwidth estimation. Specifically, we attempt to provide answers to the following questions:

• How well does the described min-plus systems theory, which assumes an idealized fluid-flow characterization of traffic and service, translate to a packet based environment?

• How robust are the available bandwidth methods to changes of the distribution of the cross traffic?

• How suitable is a min-plus systems theoretic approach for finding end-to-end estimates over multiple links?

We conduct a series of measurement experiments on the Emulab network testbed at the University of Utah [48], where experiments are run on a cluster of PCs that are interconnected by a switched Ethernet network. Propagation delays are emulated by PCs that buffer packets in transmission. Emulab provides a realistic IP network environment, yet it offers a controlled lab environment where traffic and resource availability can be explicitly configured. The ability to precisely control network resources enables us to evaluate how well available bandwidth estimates match the configured availability of network resources.

In our experiments, we take advantage of the fact that system clocks in the Emulab testbed are synchronized to within 1 ms. According to our discussion in Section IV, if synchronized clocks are not available, then the service curves computed in this section should be interpreted as being horizontally displaced by an unknown amount.

We have implemented the probing schemes for rate scanning (Section V-B) and rate chirps (Section V-C) using the rude-crude traffic generator p7|. In addition, in some experiments we include for benchmark comparison the results of measurements using an unmodified version of the pathload software.

We first present measurements on a dumbbell topology as shown in Fig. 8, where each node is realized by a PC of the Emulab network. The figure indicates the capacity and the latency of each link. Packet sizes are set to 800 bytes for cross traffic and 1472 bytes for probing traffic. The average data rate of the cross traffic is set to 25 Mbps. The probing method seeks to determine the unused capacity of the link in the center of the figure. The measurements do not address losses of probe traffic: when a probe packet is dropped, the measurement for this packet is ignored.

A. Experiment 1: Rate scanning vs. Rate chirps 



We first compare the effectiveness of the rate scanning and rate chirp methods from Sections V-B and V-C in the dumbbell topology. We assume that CBR cross traffic is sent at a rate of 25 Mbps.

For the rate scanning method, each packet train has 400 packets, and the rate of consecutive trains is increased in increments of 4 Mbps, up to at most 60 Mbps. The stopping criterion is the backlog convexity criterion from Section VI-B with a threshold of $\alpha = 4$ ms and a window size of 3 for median filtering.
For the rate chirp method, the initial spacing of probe packets is set to a rate of 4 Mbps, where the spacing between subsequent packets is governed by a spread factor of $\gamma = 1.05$. The chirp is stopped once its instantaneous rate reaches 100 Mbps, resulting in 66 packets for each chirp. The reason we let the rate chirps go up to 100 Mbps whereas the rate scans only go up to 60 Mbps is that data points at the end of the chirps become quite sparse due to the geometric increase of the chirp's rate. For rate chirps, we employ the stopping criterion proposed in [40], which aims at finding the instantaneous data rate at which one way packet delays start growing due to persistent overload. (Note that an application of the non-linearity criterion from Section VI-B to the rate chirp method would require multiple rate chirps.)



In Figs. 9(a) and 9(b) we present the results of 100 repeated estimates of the available bandwidth in terms of the computed service curves (shown as black graphs) for the rate scanning and rate chirp method, respectively. As noted in Section II, each sample of the available bandwidth can be thought of as being conditioned on the state of the network. We include, as a red graph, a rate-latency curve with the minimal delay (of approximately 21 ms)** and the average available bandwidth (25 Mbps). This curve is referred to as the reference service curve and serves as an a priori bound for the available bandwidth computations.

**The minimal delay consists of 20 ms propagation delay and approximately 1 ms transmission delay.
Fig. 10. Experiment 2: Rate scanning with different cross traffic. Panels: (a) Exponential cross traffic. (b) Pareto cross traffic.



TABLE II
Pathload measurements.

Cross traffic   lower bound   upper bound
CBR             22.6 Mbps     22.8 Mbps
Exponential     17.7 Mbps     25.4 Mbps
Pareto          15.9 Mbps     29.3 Mbps



A comparison of Figs. 9(a) and 9(b) shows that rate scanning provides more reliable estimates of the service curve than rate chirps. We note that the pathchirp method from [40] would yield better results since it smoothes the available bandwidth over 11 estimates to deal with the variability of estimates from single rate chirps. Rate scanning and rate chirps perform equally in an ideal linear time-invariant system, while rate chirps are more susceptible to random noise.

In the remaining experiments, we only consider the rate scanning method. 

B. Experiment 2: Different cross traffic distributions. 

In this experiment we evaluate the rate scanning method for different distributions of the cross traffic on the dumbbell 
topology. We consider cross traffic where interarrivals follow an exponential or Pareto distribution (with shape param- 
eter set to 1.5). All other parameters are as in Experiment 1. In particular, the average traffic rate of cross traffic is 
25 Mbps. 

In Figs. 10(a) and 10(b), respectively, we show the results for exponential and Pareto cross traffic. The 
reference service curve is shown in red. It is apparent that, compared to CBR cross traffic in Experiment 1, the higher 
variance of the cross traffic results in a higher variability of the service curve estimates. At the same time, even for 
Pareto traffic, almost all estimates of the available bandwidth provide a conservative bound for the reference service 
curve. 



Fig. 11. Results from Experiments 1-2: Derivative of service curves. 

No. bottleneck links    End-to-End probing    Per-link probing with Eq. (1) 
2                       [14.3, 20.5] Mbps     [15.1, 22.6] Mbps 
3                       [12.2, 18.2] Mbps     [13.3, 19.9] Mbps 
4                       [11.7, 17.1] Mbps     [11.8, 18.1] Mbps 

TABLE III 
Pathload measurements: Multiple Bottlenecks. 

In Fig. 11 we combine the results from Figs. 9(a), 10(a), and 10(b) in a single graph. We compute the derivatives 
of the service curves and plot the mean value averaged over the 100 estimates (with 95% confidence intervals). The 
graph also includes the derivative of the reference service curve (in red). The plot for the reference service curve shows 
a sudden increase at time 21 ms, where its derivative jumps from zero to a rate of 25 Mbps. The derivatives of the service 
estimates for CBR, exponential, and Pareto cross traffic provide lower bounds, which become more pessimistic with 
increasing variance of the cross traffic distribution. 
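
The averaging of service curve derivatives described for Fig. 11 can be sketched as follows. This is an illustration under stated assumptions rather than the processing used for the experiments: the derivative is approximated by finite differences on a common time grid, the 95% confidence interval uses a normal approximation (z = 1.96), and the two input curves are synthetic rate-latency curves, not measured data. All names are hypothetical.

```python
import numpy as np


def derivative_stats(curves, t, z=1.96):
    """Mean slope and confidence band of a set of sampled service curves.

    curves: array of shape (n_estimates, n_times), values in bits.
    t:      array of shape (n_times,), times in seconds.
    Returns (mean, lower, upper), each of shape (n_times - 1,), in bits/s.
    The normal-approximation interval (z = 1.96) is an assumption.
    """
    curves = np.asarray(curves, dtype=float)
    # Finite-difference derivative of every curve along the time axis.
    slopes = np.diff(curves, axis=1) / np.diff(t)
    mean = slopes.mean(axis=0)
    # Standard error of the mean across the repeated estimates.
    half = z * slopes.std(axis=0, ddof=1) / np.sqrt(slopes.shape[0])
    return mean, mean - half, mean + half


# Synthetic example: two rate-latency curves with a 20 ms latency.
t = np.linspace(0.0, 0.1, 11)
rates = np.array([20e6, 24e6])
curves = rates[:, None] * np.maximum(0.0, t - 0.02)[None, :]
mean, lo, hi = derivative_stats(curves, t)
```

In this synthetic example the mean slope is zero before the latency and settles at the average of the two rates (22 Mbps) afterwards.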

As a point of reference, we now show results of the pathload application, available from [13], for the same network 
and cross traffic parameters. Pathload is frequently used as a benchmark to evaluate bandwidth estimation techniques. 
The pathload application views available bandwidth as a rate and returns a range, averaged over a time interval 
τ, that bounds the observed distribution of the available bandwidth. For each cross traffic type, we ran pathload 100 
times and computed the average values of the lower and upper bounds of the estimated available bandwidth range. The 
results are summarized in Table II. A comparison of the table with Fig. 11 shows that the lower bounds of the min- 
plus theoretic estimation yield service curves whose long-term average rate is similar to or above the lower bound of 
the pathload measurements. As expected, the variation range increases if the cross traffic has a higher variability. 

C. Experiment 3: Multiple bottleneck links 

We now present measurements over networks with multiple bottleneck links. Fig. 12 depicts the network setup in 
Emulab with two bottleneck links. The bottleneck links have a capacity of 50 Mbps. The interarrival distribution of 
cross traffic is exponential with parameters as discussed earlier in this section. As the probing scheme, we again use rate 
scanning with the backlog convexity stopping criterion. 

Fig. 12. Topology with multiple bottleneck links. 

For each network, we compute the end-to-end service curve using two methods. In the first method, called End-to-End 
(E2E) Probing, we send probe traffic end-to-end over all bottleneck links. In the second method, referred to as 
Convolution, we send probe traffic separately over each bottleneck link and construct a service curve for each link. Then, we 
compute an end-to-end service curve using the convolution operation following Eq. (3). The convolution can be done 
efficiently in the Legendre domain, where the convolution becomes a simple addition. 
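
The statement that convolution becomes addition in the Legendre domain can be checked numerically. The sketch below is an illustration under assumptions that are not taken from the experiments: it uses the convex conjugate L(s) = max_t { s t - f(t) } on a finite grid, two convex rate-latency curves with made-up parameters (25 and 20 Mbps, 10 ms latency each), time in ms, and data in kbit so that a rate in kbit/ms equals the rate in Mbps. All function names are hypothetical.

```python
import numpy as np


def minplus_conv(f, g):
    """Direct min-plus convolution on a uniform grid: h[k] = min_j f[j] + g[k-j]."""
    n = len(f)
    return np.array([np.min(f[:k + 1] + g[k::-1]) for k in range(n)])


def legendre(f, t, s):
    """Discrete convex conjugate: L(s) = max over grid points t of (s*t - f(t))."""
    return np.max(s[:, None] * t[None, :] - f[None, :], axis=1)


t = np.arange(0.0, 101.0)   # time grid in ms
s = np.arange(0.0, 31.0)    # slope grid in kbit/ms (numerically Mbps)

f = 25.0 * np.maximum(0.0, t - 10.0)   # link 1: 25 Mbps, 10 ms latency
g = 20.0 * np.maximum(0.0, t - 10.0)   # link 2: 20 Mbps, 10 ms latency

# Method 1: convolve the two curves directly in the time domain.
direct = minplus_conv(f, g)

# Method 2: transform, add, and transform back. For convex curves the
# conjugate of the sum of conjugates recovers the convolution.
via_legendre = legendre(legendre(f, t, s) + legendre(g, t, s), s, t)
```

Both methods give the rate-latency curve with the smaller rate (20 Mbps) and the summed latency (20 ms). The direct convolution costs a number of operations quadratic in the grid size, whereas the addition in the transformed domain is linear once the transforms are available. The equivalence relies on the curves being convex, as they are in this example.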

Note that in our computation of the available bandwidth over multiple links, the convolution from Eq. (3) replaces 
the minimum in the widely used Eq. (1). For the special case that the available bandwidths of links are constant-rate 
functions, the convolution over multiple links is equal to the minimum of the rates. Formally, if $S_i(t) = r_i t$ for all $i$, we 
obtain $S(t) = S_1 * S_2 * \cdots * S_N(t) = \min_i r_i \, t$. Thus, the convolution expression is a true generalization of the currently 
prevailing method for composing bandwidth estimation of multiple links. 
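
The special case of constant-rate functions can be verified in a few lines. The derivation below is a sketch for two links, assuming the standard definition of the min-plus convolution as an infimum over $0 \le \tau \le t$; the symbol $\tau$ is introduced here only for the derivation.

```latex
\begin{align*}
(S_1 * S_2)(t)
  &= \inf_{0 \le \tau \le t} \{ S_1(\tau) + S_2(t - \tau) \} \\
  &= \inf_{0 \le \tau \le t} \{ r_1 \tau + r_2 (t - \tau) \} \\
  &= r_2 t + \inf_{0 \le \tau \le t} (r_1 - r_2)\, \tau \\
  &= \min(r_1, r_2)\, t .
\end{align*}
```

The expression inside the infimum is linear in $\tau$, so the infimum is attained at $\tau = 0$ when $r_1 \ge r_2$ and at $\tau = t$ otherwise. Since the min-plus convolution is associative, repeating the argument link by link gives the result for $N$ links.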

In Figs. 13(a)-13(c) we present the outcomes of our experiments for two, three, and four bottleneck links, respectively. 
As in Fig. 11, we present derivatives of the service curves. We depict the average values of 100 measurements, as 
well as the 95% confidence intervals. The reference service curve (in red) is a rate-latency service curve with a delay of 
10 ms for each traversed bottleneck link and a rate equal to the average unused link capacity. We observe that the results 
of E2E probing are lower at shorter time scales. Over longer time intervals, the Convolution method yields 
larger estimates. The long-term average rate of the computed service curves degrades with the number of hops. 

In Table III we show as a benchmark the results of pathload measurements. We include the range of values of end-to- 
end probing, as well as the results of applying Eq. (1) to per-link measurements. The degradation of available bandwidth 
estimates as the number of bottleneck links is increased is similar to that observed in Fig. 13. The long-term average of the 
service curve in Fig. 13 yields more optimistic results than the range of values in Table III. While the results of the 
system theoretic approach for paths with multiple nodes are clearly encouraging, we caution against a generalization to 
other topologies and production networks. 

VIII. Conclusions 

We have developed an interpretation of bandwidth estimation as a problem in min-plus linear systems, where the 
available bandwidth is represented by a service curve. Using service curves as opposed to constant-rate functions 
permits a description of bandwidth availability at different time-scales. We have related difficulties with network probing 
to non-linearities of the underlying system. By interpreting a network as a system that is min-plus linear at low loads, and 
becomes non-linear when the network load exceeds a threshold, we have argued that the crossing of the linear and non- 
linear regions marks the point where the available bandwidth can be observed. A series of measurement experiments 
showed that the min-plus linear approach to bandwidth estimation lends itself to the development of effective probing 
schemes. In particular, the min-plus convolution operator can be applied to obtain end-to-end estimates from per-link 
measurements. 

Acknowledgments 
The authors thank A. Burchard for many insights and suggestions. 

References 

[1] ns-2 network simulator. http://www.isi.edu/nsnam/ns/
[2] F. Agharebparast and V. C. M. Leung. Slope domain modeling and analysis of data communication networks: A network calculus complement. In Proc. IEEE ICC, pages 591-596, June 2006.
[3] R. Agrawal, R. L. Cruz, C. Okino, and R. Rajan. Performance bounds for flow control protocols. 7(3):310-323, June 1999.
[4] A. Akella, S. Seshan, and A. Shaikh. An empirical evaluation of wide-area internet bottlenecks. In Proc. ACM IMC, pages 101-114, 2003.
[5] F. Baccelli, G. Cohen, G. J. Olsder, and J.-P. Quadrat. Synchronization and Linearity: An Algebra for Discrete Event Systems. John Wiley & Sons Ltd., West Sussex, Great Britain, 1992.
[6] F. Baccelli, S. Machiraju, D. Veitch, and J. Bolot. The role of PASTA in network measurement. In Proc. ACM SIGCOMM, pages 231-242, Pisa, September 2006.
[7] R. L. Carter and M. E. Crovella. Measuring bottleneck link speed in packet switched networks. Performance Evaluation, 27 and 28:297-318, 1996.
[8] C. Cetinkaya, V. Kanodia, and E. W. Knightly. Scalable services via egress admission control. 3(1):69-81, March 2001.
[9] C.-S. Chang. Performance Guarantees in Communication Networks. Springer-Verlag, London, Great Britain, 2000.
[10] R. Cruz. A calculus for network delay, parts I and II. IEEE Transactions on Information Theory, 37(1):114-141, January 1991.
[11] R. L. Cruz. Quality of service guarantees in virtual circuit switched networks. 13(6):1048-1056, August 1995.
[12] G. Van der Auwera, P. T. David, and M. Reisslein. Bit rate-variability of H.264/AVC FRExt. Technical report, Arizona State University, April 2006.
[13] C. Dovrolis and M. Jain. Pathload: A measurement tool for the available bandwidth of network paths. http://www.cc.gatech.edu/fac/Constantinos.Dovrolis/pathload.html
[14] C. Dovrolis, P. Ramanathan, and D. Moore. What do packet dispersion techniques measure? In Proc. IEEE INFOCOM, pages 905-914, April 2001.
[15] M. Fidler and S. Recker. Conjugate network calculus: A dual approach applying the Legendre transform. Computer Networks, 50(8):1026-1039, June 2006.
[16] T. Hisakado, K. Okumura, V. Vukadinovic, and L. Trajkovic. Characterization of a simple communication network using Legendre transform. In Proc. International Symposium on Circuits and Systems (ISCAS), pages 738-741, May 2003.
[17] N. Hu and P. Steenkiste. Evaluation and characterization of available bandwidth probing techniques. 21(6):879-894, August 2003.
[18] V. Jacobson. Congestion avoidance and control. In Proc. ACM SIGCOMM, pages 314-329, August 1988.
[19] M. Jain and C. Dovrolis. End-to-end available bandwidth: Measurement methodology, dynamics, and relation with TCP throughput. In Proc. ACM SIGCOMM, pages 295-308, October 2002.
[20] M. Jain and C. Dovrolis. Pathload: A measurement tool for end-to-end available bandwidth. In Proc. Passive and Active Measurement (PAM) Workshop, pages 14-25, March 2002.
[21] M. Jain and C. Dovrolis. Ten fallacies and pitfalls on end-to-end available bandwidth estimation. In Proc. ACM IMC, pages 272-277, 2004.
[22] M. Jain and C. Dovrolis. End-to-end estimation of the available bandwidth variation range. In Proc. ACM SIGMETRICS, pages 265-276, June 2005.
[23] Y. Jiang, P. J. Emstad, A. Nevin, V. Nicola, and M. Fidler. Measurement-based admission control for a flow-aware network. In Proc. of 1st EuroNGI Conference on Next Generation Internet Networks Traffic Engineering, pages 318-325, April 2005.
[24] S.-R. Kang, X. Liu, M. Dai, and D. Loguinov. Packet-pair bandwidth estimation: Stochastic analysis of a single congested node. In Proc. IEEE ICNP, pages 316-325, October 2004.
[25] R. Kapoor, L.-J. Chen, L. Lao, M. Gerla, and M. Y. Sanadidi. CapProbe: A simple and accurate capacity estimation technique. In Proc. ACM SIGCOMM, pages 67-78, August/September 2004.
[26] S. Keshav. A control-theoretic approach to flow control. In Proc. ACM SIGCOMM, pages 3-15, September 1991.
[27] J. Laine, S. Saaristo, and R. Prior. Rude/crude. http://rude.sourceforge.net/
[28] K. Lakshminarayanan, V. N. Padmanabhan, and J. Padhye. Bandwidth estimation in broadband access networks. In Proc. ACM IMC, pages 314-321, October 2004.
[29] J.-Y. Le Boudec and P. Thiran. Network Calculus: A Theory of Deterministic Queuing Systems for the Internet. Springer-Verlag, Berlin, Germany, 2001.
[30] J. Liebeherr, M. Fidler, and S. Valaee. Min-plus system interpretation of bandwidth estimation. In Proc. IEEE INFOCOM, May 2007.
[31] X. Liu, K. Ravindran, and D. Loguinov. What signals do packet-pair dispersions carry? In Proc. IEEE INFOCOM, pages 281-292, March 2005.
[32] X. Liu, K. Ravindran, and D. Loguinov. A queuing-theoretic foundation of available bandwidth estimation: Single-hop analysis. IEEE/ACM Transactions on Networking, 15(4):918-931, August 2007.
[33] X. Liu, K. Ravindran, and D. Loguinov. A stochastic foundation of available bandwidth estimation: Multi-hop analysis. IEEE/ACM Transactions on Networking, 16(2), April 2008. (To appear).
[34] S. Machiraju, D. Veitch, F. Baccelli, and J. Bolot. Adding definition to active probing. ACM SIGCOMM Computer Communication Review, 37(2):19-28, April 2007.
[35] B. Melander, M. Bjorkman, and P. Gunningberg. A new end-to-end probing and analysis method for estimating bandwidth bottlenecks. In Proc. IEEE GLOBECOM, pages 415-420, November 2000.
[36] B. Melander, M. Bjorkman, and P. Gunningberg. First-come-first-served packet dispersion and implications for TCP. In Proc. IEEE GLOBECOM, pages 2170-2174, November 2002.
[37] J. Naudts. Towards real-time measurement of traffic control parameters. Computer Networks, 34(1):157-167, July 2000.
[38] J. Navratil and R. L. Cottrell. ABwE: A practical approach to available bandwidth estimation. In Proc. Passive and Active Measurement (PAM) Workshop, pages 1-11, April 2003.
[39] V. Paxson. Measurements and Analysis of End-to-End Internet Dynamics. PhD thesis, Univ. of California, Berkeley, April 1997.
[40] V. Ribeiro, R. Riedi, R. Baraniuk, J. Navratil, and L. Cottrell. PathChirp: Efficient available bandwidth estimation for network paths. In Proc. Passive and Active Measurement Workshop, April 2003.
[41] R. T. Rockafellar. Convex Analysis. Princeton University Press, 1972.
[42] A. Shriram and I. Kaur. Empirical evaluation of techniques for measuring available bandwidth. In Proc. IEEE INFOCOM, pages 2162-2170, May 2007.
[43] A. Shriram, M. Murray, Y. Hyun, N. Brownlee, A. Broido, M. Fomenkov, and K. C. Claffy. Comparison of public end-to-end bandwidth estimation tools on high-speed links. In Proc. Passive and Active Measurement Workshop, pages 306-320, March 2005.
[44] J. Sommers, P. Barford, and W. Willinger. Laboratory-based calibration of available bandwidth estimation tools. Microprocessors and Microsystems, 31(4):222-235, June 2007.
[45] J. Strauss, D. Katabi, and F. Kaashoek. A measurement study of available bandwidth estimation tools. In Proc. ACM IMC, pages 39-44, 2003.
[46] M. M. Bin Tariq, A. Dhamdhere, C. Dovrolis, and M. Ammar. Poisson versus periodic path probing (or, does PASTA matter?). In Proc. 5th Conference on Internet Measurement, pages 119-124, October 2005.
[47] S. Valaee and B. Li. Distributed call admission control for ad hoc networks. In Proc. of IEEE 56th Vehicular Technology Conference (VTC 2002-Fall), pages 1244-1248, September 2002.
[48] B. White et al. An integrated experimental environment for distributed systems and networks. In Proc. of OSDI 2002, pages 255-270, December 2002.
[49] D. Wu and R. Negi. Effective capacity: A wireless link model for support of quality of service. IEEE Transactions on Wireless Communications, 2(4):630-643, July 2003.
[50] Y. Zhang, N. Duffield, V. Paxson, and S. Shenker. On the constancy of internet path properties. In Proc. ACM SIGCOMM Internet Measurement Workshop, pages 197-211, November 2001.



